Overview

Dataset statistics

Number of variables: 20
Number of observations: 428
Missing cells: 88
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 67.0 KiB
Average record size in memory: 160.3 B
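
The percentage figures in this report follow directly from the counts. As a minimal sketch (the helper name `percent` is illustrative and not part of pandas-profiling), the share of missing cells is the missing count divided by the total number of cells, that is, observations times variables:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Counts taken from the dataset statistics of this report.
n_variables = 20
n_observations = 428
missing_cells = 88

total_cells = n_variables * n_observations  # 20 * 428 = 8560 cells
print(percent(missing_cells, total_cells))  # share of missing cells, in percent
```

The same helper reproduces the per-variable figures, for example 15 missing values out of 428 observations for city_mpg.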

Variable types

NUM: 11
BOOL: 8
CAT: 1

Warnings

vehicle_name has a high cardinality: 425 distinct values (High cardinality)
dealer_cost is highly correlated with retail_price (High correlation)
retail_price is highly correlated with dealer_cost (High correlation)
hwy_mpg is highly correlated with city_mpg (High correlation)
city_mpg is highly correlated with hwy_mpg (High correlation)
city_mpg has 15 (3.5%) missing values (Missing)
hwy_mpg has 15 (3.5%) missing values (Missing)
len has 26 (6.1%) missing values (Missing)
width has 28 (6.5%) missing values (Missing)
vehicle_name is uniformly distributed (Uniform)

Reproduction

Analysis started: 2020-12-04 12:48:40.516332
Analysis finished: 2020-12-04 12:49:03.797543
Duration: 23.28 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

vehicle_name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 425
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
Mercedes-Benz C320 4dr | 2 | 0.5%
Infiniti G35 4dr | 2 | 0.5%
Mercedes-Benz C240 4dr | 2 | 0.5%
Ford F-150 Regular Cab XL | 1 | 0.2%
Kia Amanti 4dr | 1 | 0.2%
Honda Pilot LX | 1 | 0.2%
Mazda B4000 SE Cab Plus | 1 | 0.2%
Oldsmobile Silhouette GL | 1 | 0.2%
Toyota Corolla S 4dr | 1 | 0.2%
Saab 9-5 Aero | 1 | 0.2%
Other values (415) | 415 | 97.0%

[Figure: Frequencies of value counts]

Unique

Unique: 422
Unique (%): 98.6%

[Figure: Histogram of lengths of the category]

Length

Max length: 45
Median length: 21
Mean length: 21.94392523
Min length: 8

smallsporty_compactlarge_sedan
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
1 | 244 | 57.0%
0 | 184 | 43.0%

sports_car
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 379 | 88.6%
1 | 49 | 11.4%

suv
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 368 | 86.0%
1 | 60 | 14.0%

wagon
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 398 | 93.0%
1 | 30 | 7.0%

minivan
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 408 | 95.3%
1 | 20 | 4.7%

pickup
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 404 | 94.4%
1 | 24 | 5.6%

awd
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 336 | 78.5%
1 | 92 | 21.5%

rwd
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count | Frequency (%)
0 | 318 | 74.3%
1 | 110 | 25.7%

retail_price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 410
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32774.85514
Minimum: 10280
Maximum: 192465
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 10280
5th percentile: 13691
Q1: 20334.25
Median: 27635
Q3: 39205
95th percentile: 72864.25
Maximum: 192465
Range: 182185
Interquartile range (IQR): 18870.75

Descriptive statistics

Standard deviation: 19431.71667
Coefficient of variation (CV): 0.5928848988
Kurtosis: 13.87920552
Mean: 32774.85514
Median Absolute Deviation (MAD): 8314
Skewness: 2.798099275
Sum: 14027638
Variance: 377591612.9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
19860 | 2 | 0.5%
15389 | 2 | 0.5%
31545 | 2 | 0.5%
33995 | 2 | 0.5%
19635 | 2 | 0.5%
21595 | 2 | 0.5%
49995 | 2 | 0.5%
29995 | 2 | 0.5%
35940 | 2 | 0.5%
25700 | 2 | 0.5%
Other values (400) | 408 | 95.3%

Value | Count | Frequency (%)
10280 | 1 | 0.2%
10539 | 1 | 0.2%
10760 | 1 | 0.2%
10995 | 1 | 0.2%
11155 | 1 | 0.2%

Value | Count | Frequency (%)
192465 | 1 | 0.2%
128420 | 1 | 0.2%
126670 | 1 | 0.2%
121770 | 1 | 0.2%
94820 | 1 | 0.2%
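
Several of the statistics reported for retail_price are simple functions of the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below recomputes them from the reported values; `derived_stats` is a hypothetical helper for illustration, not a pandas-profiling function, and small differences in the last digits are expected because the inputs are already rounded.

```python
def derived_stats(minimum, q1, q3, maximum, mean, std):
    """Recompute summary statistics that follow from other reported values."""
    return {
        "range": maximum - minimum,  # spread between the extremes
        "iqr": q3 - q1,              # spread of the middle 50% of values
        "cv": std / mean,            # relative dispersion (unitless)
        "variance": std ** 2,        # squared standard deviation
    }


# Values copied from the retail_price section of this report.
stats = derived_stats(
    minimum=10280,
    q1=20334.25,
    q3=39205,
    maximum=192465,
    mean=32774.85514,
    std=19431.71667,
)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The same relationships hold for every numeric variable in the report, so the helper can be used as a quick consistency check on any of the sections.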
 

dealer_cost
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 425
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30014.70093
Minimum: 9875
Maximum: 173560
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 9875
5th percentile: 12836.65
Q1: 18866
Median: 25294.5
Q3: 35710.25
95th percentile: 66471.95
Maximum: 173560
Range: 163685
Interquartile range (IQR): 16844.25

Descriptive statistics

Standard deviation: 17642.11775
Coefficient of variation (CV): 0.5877825599
Kurtosis: 13.94616377
Mean: 30014.70093
Median Absolute Deviation (MAD): 7531
Skewness: 2.834740404
Sum: 12846292
Variance: 311244318.7
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
19638 | 2 | 0.5%
14207 | 2 | 0.5%
68306 | 2 | 0.5%
37886 | 1 | 0.2%
24926 | 1 | 0.2%
21198 | 1 | 0.2%
23883 | 1 | 0.2%
24909 | 1 | 0.2%
13650 | 1 | 0.2%
24915 | 1 | 0.2%
Other values (415) | 415 | 97.0%

Value | Count | Frequency (%)
9875 | 1 | 0.2%
10107 | 1 | 0.2%
10144 | 1 | 0.2%
10319 | 1 | 0.2%
10642 | 1 | 0.2%

Value | Count | Frequency (%)
173560 | 1 | 0.2%
119600 | 1 | 0.2%
117854 | 1 | 0.2%
113388 | 1 | 0.2%
88324 | 1 | 0.2%

engine_size_(l)
Real number (ℝ≥0)

Distinct: 43
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.196728972
Minimum: 1.3
Maximum: 8.3
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 1.3
5th percentile: 1.7
Q1: 2.375
Median: 3
Q3: 3.9
95th percentile: 5.3
Maximum: 8.3
Range: 7
Interquartile range (IQR): 1.525

Descriptive statistics

Standard deviation: 1.108594718
Coefficient of variation (CV): 0.3467903373
Kurtosis: 0.5419435378
Mean: 3.196728972
Median Absolute Deviation (MAD): 0.8
Skewness: 0.7081519825
Sum: 1368.2
Variance: 1.22898225
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=43)]

Value | Count | Frequency (%)
3 | 42 | 9.8%
3.5 | 34 | 7.9%
2 | 30 | 7.0%
2.5 | 26 | 6.1%
2.4 | 23 | 5.4%
1.8 | 23 | 5.4%
4.6 | 21 | 4.9%
4.2 | 20 | 4.7%
3.2 | 18 | 4.2%
3.8 | 17 | 4.0%
Other values (33) | 174 | 40.7%

Value | Count | Frequency (%)
1.3 | 2 | 0.5%
1.4 | 1 | 0.2%
1.5 | 6 | 1.4%
1.6 | 10 | 2.3%
1.7 | 4 | 0.9%

Value | Count | Frequency (%)
8.3 | 1 | 0.2%
6.8 | 1 | 0.2%
6 | 6 | 1.4%
5.7 | 3 | 0.7%
5.6 | 2 | 0.5%

cyl
Real number (ℝ)

Distinct: 8
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.775700935
Minimum: -1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: -1
5th percentile: 4
Q1: 4
Median: 6
Q3: 6
95th percentile: 8
Maximum: 12
Range: 13
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.622779362
Coefficient of variation (CV): 0.2809666532
Kurtosis: 1.396548909
Mean: 5.775700935
Median Absolute Deviation (MAD): 2
Skewness: 0.2342651493
Sum: 2472
Variance: 2.633412856
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]

Value | Count | Frequency (%)
6 | 190 | 44.4%
4 | 136 | 31.8%
8 | 87 | 20.3%
5 | 7 | 1.6%
12 | 3 | 0.7%
10 | 2 | 0.5%
-1 | 2 | 0.5%
3 | 1 | 0.2%

Value | Count | Frequency (%)
-1 | 2 | 0.5%
3 | 1 | 0.2%
4 | 136 | 31.8%
5 | 7 | 1.6%
6 | 190 | 44.4%

Value | Count | Frequency (%)
12 | 3 | 0.7%
10 | 2 | 0.5%
8 | 87 | 20.3%
6 | 190 | 44.4%
5 | 7 | 1.6%

hp
Real number (ℝ≥0)

Distinct: 110
Distinct (%): 25.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 215.885514
Minimum: 73
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 73
5th percentile: 115
Q1: 165
Median: 210
Q3: 255
95th percentile: 338.25
Maximum: 500
Range: 427
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 71.83603158
Coefficient of variation (CV): 0.3327505873
Kurtosis: 1.552158629
Mean: 215.885514
Median Absolute Deviation (MAD): 45
Skewness: 0.9303307363
Sum: 92399
Variance: 5160.415434
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
200 | 17 | 4.0%
210 | 14 | 3.3%
215 | 14 | 3.3%
225 | 13 | 3.0%
240 | 13 | 3.0%
220 | 12 | 2.8%
140 | 12 | 2.8%
300 | 11 | 2.6%
170 | 11 | 2.6%
130 | 10 | 2.3%
Other values (100) | 301 | 70.3%

Value | Count | Frequency (%)
73 | 1 | 0.2%
93 | 1 | 0.2%
100 | 1 | 0.2%
103 | 5 | 1.2%
104 | 3 | 0.7%

Value | Count | Frequency (%)
500 | 1 | 0.2%
493 | 3 | 0.7%
477 | 1 | 0.2%
450 | 1 | 0.2%
420 | 1 | 0.2%

city_mpg
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 28
Distinct (%): 6.8%
Missing: 15
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.08958838
Minimum: 10
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 10
5th percentile: 14
Q1: 17
Median: 19
Q3: 21
95th percentile: 29
Maximum: 60
Range: 50
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 5.219382573
Coefficient of variation (CV): 0.2598053516
Kurtosis: 16.61615357
Mean: 20.08958838
Median Absolute Deviation (MAD): 2
Skewness: 2.928557209
Sum: 8297
Variance: 27.24195444
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=28)]

Value | Count | Frequency (%)
18 | 68 | 15.9%
20 | 56 | 13.1%
17 | 40 | 9.3%
21 | 38 | 8.9%
19 | 38 | 8.9%
16 | 29 | 6.8%
24 | 22 | 5.1%
26 | 21 | 4.9%
22 | 18 | 4.2%
15 | 17 | 4.0%
Other values (18) | 66 | 15.4%
(Missing) | 15 | 3.5%

Value | Count | Frequency (%)
10 | 1 | 0.2%
12 | 2 | 0.5%
13 | 11 | 2.6%
14 | 13 | 3.0%
15 | 17 | 4.0%

Value | Count | Frequency (%)
60 | 1 | 0.2%
59 | 1 | 0.2%
46 | 1 | 0.2%
38 | 1 | 0.2%
36 | 1 | 0.2%

hwy_mpg
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 32
Distinct (%): 7.7%
Missing: 15
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.90556901
Minimum: 12
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 12
5th percentile: 18
Q1: 24
Median: 26
Q3: 29
95th percentile: 36
Maximum: 66
Range: 54
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.70371136
Coefficient of variation (CV): 0.2119899921
Kurtosis: 6.425357238
Mean: 26.90556901
Median Absolute Deviation (MAD): 3
Skewness: 1.350295982
Sum: 11112
Variance: 32.53232328
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]

Value | Count | Frequency (%)
26 | 54 | 12.6%
25 | 43 | 10.0%
28 | 38 | 8.9%
29 | 34 | 7.9%
27 | 27 | 6.3%
24 | 26 | 6.1%
30 | 23 | 5.4%
23 | 16 | 3.7%
21 | 16 | 3.7%
19 | 15 | 3.5%
Other values (22) | 121 | 28.3%
(Missing) | 15 | 3.5%

Value | Count | Frequency (%)
12 | 1 | 0.2%
14 | 1 | 0.2%
16 | 2 | 0.5%
17 | 9 | 2.1%
18 | 9 | 2.1%

Value | Count | Frequency (%)
66 | 1 | 0.2%
51 | 2 | 0.5%
46 | 1 | 0.2%
44 | 1 | 0.2%
43 | 2 | 0.5%

weight
Real number (ℝ≥0)

Distinct: 347
Distinct (%): 81.5%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3577.213615
Minimum: 1850
Maximum: 7190
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 1850
5th percentile: 2513
Q1: 3102
Median: 3474.5
Q3: 3974.25
95th percentile: 4996.75
Maximum: 7190
Range: 5340
Interquartile range (IQR): 872.25

Descriptive statistics

Standard deviation: 760.4376628
Coefficient of variation (CV): 0.2125782088
Kurtosis: 1.678289561
Mean: 3577.213615
Median Absolute Deviation (MAD): 428
Skewness: 0.8933847105
Sum: 1523893
Variance: 578265.439
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
3285 | 4 | 0.9%
3175 | 4 | 0.9%
3470 | 3 | 0.7%
4024 | 3 | 0.7%
3217 | 3 | 0.7%
2676 | 3 | 0.7%
4052 | 3 | 0.7%
2692 | 3 | 0.7%
3430 | 3 | 0.7%
3428 | 3 | 0.7%
Other values (337) | 394 | 92.1%

Value | Count | Frequency (%)
1850 | 1 | 0.2%
2035 | 1 | 0.2%
2055 | 1 | 0.2%
2085 | 1 | 0.2%
2195 | 1 | 0.2%

Value | Count | Frequency (%)
7190 | 1 | 0.2%
6400 | 1 | 0.2%
6133 | 1 | 0.2%
5969 | 1 | 0.2%
5879 | 1 | 0.2%

wheel_base
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 9.4%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 108.1737089
Minimum: 89
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 89
5th percentile: 95.25
Q1: 103
Median: 107
Q3: 112
95th percentile: 123
Maximum: 144
Range: 55
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 8.326449076
Coefficient of variation (CV): 0.07697294619
Kurtosis: 2.112464038
Mean: 108.1737089
Median Absolute Deviation (MAD): 5
Skewness: 0.9552742051
Sum: 46082
Variance: 69.32975421
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=40)]

Value | Count | Frequency (%)
107 | 45 | 10.5%
103 | 30 | 7.0%
106 | 27 | 6.3%
112 | 25 | 5.8%
104 | 22 | 5.1%
105 | 21 | 4.9%
115 | 20 | 4.7%
111 | 17 | 4.0%
109 | 17 | 4.0%
101 | 16 | 3.7%
Other values (30) | 186 | 43.5%

Value | Count | Frequency (%)
89 | 2 | 0.5%
93 | 9 | 2.1%
95 | 11 | 2.6%
96 | 5 | 1.2%
97 | 3 | 0.7%

Value | Count | Frequency (%)
144 | 2 | 0.5%
140 | 1 | 0.2%
137 | 1 | 0.2%
133 | 2 | 0.5%
131 | 1 | 0.2%

len
Real number (ℝ≥0)

MISSING

Distinct: 61
Distinct (%): 15.2%
Missing: 26
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 185.1268657
Minimum: 143
Maximum: 227
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 143
5th percentile: 162.05
Q1: 177
Median: 186
Q3: 193
95th percentile: 207
Maximum: 227
Range: 84
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.31252292
Coefficient of variation (CV): 0.07191027013
Kurtosis: 0.3112242615
Mean: 185.1268657
Median Absolute Deviation (MAD): 8
Skewness: -0.09622117289
Sum: 74421
Variance: 177.2232665
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
178 | 26 | 6.1%
190 | 22 | 5.1%
187 | 17 | 4.0%
192 | 14 | 3.3%
200 | 13 | 3.0%
177 | 13 | 3.0%
188 | 13 | 3.0%
179 | 13 | 3.0%
175 | 12 | 2.8%
183 | 12 | 2.8%
Other values (51) | 247 | 57.7%
(Missing) | 26 | 6.1%

Value | Count | Frequency (%)
143 | 1 | 0.2%
144 | 1 | 0.2%
150 | 1 | 0.2%
153 | 2 | 0.5%
154 | 1 | 0.2%

Value | Count | Frequency (%)
227 | 1 | 0.2%
221 | 1 | 0.2%
219 | 2 | 0.5%
215 | 2 | 0.5%
212 | 7 | 1.6%

width
Real number (ℝ≥0)

MISSING

Distinct: 18
Distinct (%): 4.5%
Missing: 28
Missing (%): 6.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.2925
Minimum: 64
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 64
5th percentile: 67
Q1: 69
Median: 71
Q3: 73
95th percentile: 78
Maximum: 81
Range: 17
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.393483915
Coefficient of variation (CV): 0.04759945177
Kurtosis: -0.2123582009
Mean: 71.2925
Median Absolute Deviation (MAD): 2
Skewness: 0.5607116725
Sum: 28517
Variance: 11.51573308
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=18)]

Value | Count | Frequency (%)
72 | 55 | 12.9%
68 | 47 | 11.0%
69 | 42 | 9.8%
73 | 42 | 9.8%
70 | 41 | 9.6%
71 | 36 | 8.4%
67 | 36 | 8.4%
74 | 21 | 4.9%
75 | 17 | 4.0%
78 | 16 | 3.7%
Other values (8) | 47 | 11.0%
(Missing) | 28 | 6.5%

Value | Count | Frequency (%)
64 | 1 | 0.2%
65 | 3 | 0.7%
66 | 10 | 2.3%
67 | 36 | 8.4%
68 | 47 | 11.0%

Value | Count | Frequency (%)
81 | 1 | 0.2%
80 | 2 | 0.5%
79 | 12 | 2.8%
78 | 16 | 3.7%
77 | 6 | 1.4%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
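
As a minimal, dependency-free sketch of that calculation (the function name `pearson_r` is illustrative; the report itself does not show the code it ran), the covariance and both standard deviations share the same normalising factor, so it cancels and plain sums suffice. The example data are the retail_price and dealer_cost values from the first five sample rows of this report.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r: covariance of xs and ys over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of the covariance and of both standard deviations
    # cancel, so sums of products and sums of squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# retail_price and dealer_cost for the first five sample rows.
retail = [43755, 46100, 36945, 89765, 23820]
dealer = [39014, 41100, 33337, 79978, 21761]
print(round(pearson_r(retail, dealer), 4))
```

Five rows are far too few to reproduce the report's figure; the example only illustrates the mechanics.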

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
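
A minimal sketch of that procedure, assuming tied values receive the average of the ranks they span (a common convention, though the report does not state which one was used). The names `average_ranks` and `spearman_rho` are illustrative. The example data are the city_mpg and hwy_mpg values from the first five sample rows.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the rank variables."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_rx = sum(rx) / n
    mean_ry = sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_rx) ** 2 for a in rx)
    ss_y = sum((b - mean_ry) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# city_mpg and hwy_mpg for the first five sample rows.
city = [18.0, 18.0, 17.0, 17.0, 24.0]
hwy = [24.0, 24.0, 23.0, 24.0, 31.0]
print(round(spearman_rho(city, hwy), 4))
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.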

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
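
The pair-counting definition can be sketched directly. This is the simple "tau-a" form described in the text, in which tied pairs count as neither concordant nor discordant; many libraries instead report a tie-adjusted variant (tau-b), and the report does not state which one produced its figure. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau_a(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # product == 0 means a tie in x or y: counted in neither group
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The loop examines every pair, so the cost grows quadratically with the number of observations; that is fine for a dataset of 428 rows.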

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values


Sample

First rows

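(index) | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width
0 | Acura 3.5 RL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 43755 | 39014 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3880.0 | 115.0 | 197.0 | 72.0
1 | Acura 3.5 RL w/Navigation 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 46100 | 41100 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3893.0 | 115.0 | 197.0 | 72.0
2 | Acura MDX | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 36945 | 33337 | 3.5 | 6 | 265 | 17.0 | 23.0 | 4451.0 | 106.0 | 189.0 | 77.0
3 | Acura NSX coupe 2dr manual S | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 89765 | 79978 | 3.2 | 6 | 290 | 17.0 | 24.0 | 3153.0 | 100.0 | 174.0 | 71.0
4 | Acura RSX Type S 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 23820 | 21761 | 2.0 | 4 | 200 | 24.0 | 31.0 | 2778.0 | 101.0 | 172.0 | 68.0
5 | Acura TL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33195 | 30299 | 3.2 | 6 | 270 | 20.0 | 28.0 | 3575.0 | 108.0 | 186.0 | 72.0
6 | Acura TSX 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26990 | 24647 | 2.4 | 4 | 200 | 22.0 | 29.0 | 3230.0 | 105.0 | 183.0 | 69.0
7 | Audi A4 1.8T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25940 | 23508 | 1.8 | 4 | 170 | 22.0 | 31.0 | 3252.0 | 104.0 | 179.0 | 70.0
8 | Audi A4 3.0 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 31840 | 28846 | 3.0 | 6 | 220 | 20.0 | 28.0 | 3462.0 | 104.0 | 179.0 | 70.0
9 | Audi A4 3.0 convertible 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42490 | 38325 | 3.0 | 6 | 220 | 20.0 | 27.0 | 3814.0 | 105.0 | 180.0 | 70.0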

Last rows

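(index) | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width
418 | Volvo S40 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25135 | 23701 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2767.0 | 101.0 | 178.0 | 68.0
419 | Volvo S60 2.5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 31745 | 29916 | 2.5 | 5 | 208 | 20.0 | 27.0 | 3903.0 | 107.0 | 180.0 | 71.0
420 | Volvo S60 R 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37560 | 35382 | 2.5 | 5 | 300 | 18.0 | 25.0 | 3571.0 | 107.0 | 181.0 | 71.0
421 | Volvo S60 T5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 34845 | 32902 | 2.3 | 5 | 247 | 20.0 | 28.0 | 3766.0 | 107.0 | 180.0 | 71.0
422 | Volvo S80 2.5T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37885 | 35688 | 2.5 | 5 | 194 | 20.0 | 27.0 | 3691.0 | 110.0 | 190.0 | 72.0
423 | Volvo S80 2.9 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 37730 | 35542 | 2.9 | 6 | 208 | 20.0 | 28.0 | 3576.0 | 110.0 | 190.0 | 72.0
424 | Volvo S80 T6 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 45210 | 42573 | 2.9 | 6 | 268 | 19.0 | 26.0 | 3653.0 | 110.0 | 190.0 | 72.0
425 | Volvo V40 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 26135 | 24641 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2822.0 | 101.0 | 180.0 | 68.0
426 | Volvo XC70 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 35145 | 33112 | 2.5 | 5 | 208 | NaN | NaN | 3823.0 | 109.0 | 186.0 | 73.0
427 | Volvo XC90 T6 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 41250 | 38851 | 2.9 | 6 | 268 | 15.0 | 20.0 | 4638.0 | 113.0 | 189.0 | 75.0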